Storage Fit Learning with Unlabeled Data

Authors

Bo-Jian Hou

Lijun Zhang

Zhi-Hua Zhou

Abstract

By exploiting abundant unlabeled data, semi-supervised learning approaches have proved useful in a variety of tasks. Existing approaches, however, neglect the fact that the storage available for the learning process differs from one situation to another, so a learning approach should be able to adapt to a storage budget. In this paper, we focus on graph-based semi-supervised learning and propose two storage fit learning approaches that can adjust their behavior to different storage budgets. Specifically, we use low-rank matrix approximation techniques to find a low-rank approximator of the similarity matrix that meets the storage budget. The first approach is based on stochastic optimization; it is an iterative approach that converges globally to the optimal low-rank approximator. The second approach is based on the Nyström method, which finds a good low-rank approximator efficiently and is suitable for real-time applications. Experiments show that the proposed methods adapt to different storage budgets and perform well in different scenarios.
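To make the storage-budget idea concrete, the following is a minimal illustrative sketch of a standard Nyström approximation of a similarity matrix. It is not the authors' implementation. The RBF similarity, uniform landmark sampling, the budget measured in stored floats, and the names `rbf_kernel`, `nystrom_factors`, `approx_matvec`, and `budget_floats` are all assumptions made for this example.

```python
import numpy as np


def rbf_kernel(X, Y, gamma=1.0):
    """RBF similarity between rows of X and rows of Y (assumed similarity)."""
    sq = (X**2).sum(1)[:, None] + (Y**2).sum(1)[None, :] - 2.0 * X @ Y.T
    return np.exp(-gamma * np.maximum(sq, 0.0))


def nystrom_factors(X, budget_floats, gamma=1.0, seed=0):
    """Return (C, W_pinv, idx) so that K is approximated by C @ W_pinv @ C.T.

    Storing C (n x m) and W_pinv (m x m) costs n*m + m*m floats, so we pick
    the largest number of landmarks m that fits in budget_floats.
    """
    n = X.shape[0]
    if budget_floats < n + 1:
        raise ValueError("budget too small for even one landmark")
    # Largest m with m^2 + n*m <= budget (positive root of the quadratic).
    m = int((-n + np.sqrt(n * n + 4.0 * budget_floats)) / 2.0)
    m = max(1, min(m, n))
    while m > 1 and n * m + m * m > budget_floats:  # guard against rounding
        m -= 1
    rng = np.random.default_rng(seed)
    idx = rng.choice(n, size=m, replace=False)  # uniform landmark sampling
    C = rbf_kernel(X, X[idx], gamma)            # n x m block of K
    W_pinv = np.linalg.pinv(C[idx])             # pseudo-inverse of m x m block
    return C, W_pinv, idx


def approx_matvec(C, W_pinv, v):
    """Multiply the approximate similarity matrix by v without forming it."""
    return C @ (W_pinv @ (C.T @ v))
```

With these factors, a graph-based method can apply the approximate similarity matrix to a label vector through `approx_matvec` without ever holding the full n x n matrix in memory; a tighter budget simply yields fewer landmarks and a coarser approximation.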


Similar resources

Estimate Unlabeled-Data-Distribution for Semi-supervised PU Learning

Traditional supervised classifiers use only labeled data (feature/label pairs) as the training set, while the unlabeled data is used as the testing set. In practice, it is often the case that labeled data is hard to obtain and that the unlabeled data contains instances that belong to the predefined class beyond the labeled data categories. This problem has been widely studied in recent year...


What causes category-shifting in human semi-supervised learning?

In a categorization task involving both labeled and unlabeled data, it has been shown that humans make use of the underlying distribution of the unlabeled examples. It has also been shown that humans are sensitive to shifts in this distribution, and will change predicted classifications based on these shifts. It is not immediately obvious what causes these shifts – what specific properties of t...


Multi-Label Classification with Unlabeled Data: An Inductive Approach

The problem of multi-label classification has attracted great interest in the last decade. Multi-label classification refers to problems where an example that is represented by a single instance can be assigned to more than one category. Until now, most research on multi-label classification has focused on supervised settings, which assume that a large amount of labeled traini...


Efficient Computation and Model Selection in Semi-Supervised Learning

Traditional learning algorithms use only labeled data for training. However, labeled examples are often difficult or time-consuming to obtain, since they require substantial labeling effort from humans. Unlabeled data, on the other hand, are often relatively easy to collect. Semi-supervised learning addresses this problem by using large quantities of unlabeled data with the labeled data to build...


Data Dependant Learners Ensemble Pruning

Ensemble learning aims at combining several slightly different learners to construct a stronger learner. An ensemble of a well-selected subset of learners would outperform the ensemble of all of them. However, the well-studied accuracy/diversity ensemble pruning framework can overfit the training data, which results in a target learner with relatively low generalization ability. We propose to ensemb...




Publication date: 2017
